ABSTRACT 

Psychology studies often have low statistical power. 
Sample size tables, as given by J. Cohen (1988), may be used to 
increase power, but they are based on Monte Carlo studies of 
relatively "tame" mathematical distributions, as compared to 
psychology data sets. In this study, Monte Carlo methods were used to 
investigate Type I and Type II error properties of the independent 
samples "t" test under a "discrete mass at zero with gap" data set to 
determine if the sample size tables given by Cohen yield correct 
results. Monte Carlo methods were used with a FORTRAN program to 
sample with replacement from a population of 516 responses to a 
survey regarding the age at which subjects first used cigarettes. Samples 
were randomly drawn at ten pairs of sample sizes: (1) n1=5, n2=15; (2) n1=10, n2=10; 
(3) n1=10, n2=30; (4) n1=20, n2=20; (5) n1=15, n2=45; (6) n1=30, 
n2=30; (7) n1=20, n2=60; (8) n1=40, n2=40; (9) n1=30, n2=90; and (10) 
n1=60, n2=60. For the smallest unbalanced sample size (5,15), the "t" 
test was generally not robust. For the remaining sample sizes, 
results were in agreement with normal curve theory. When confronted 
with non-normal data sets, psychology researchers do not need to make 
any modifications to Cohen's (1988) tables when making sample size 
determinations. Two graphs illustrate the study. A 12-item list of 
references is included. (SLD) 
ABSTRACT 

Psychology studies often have low statistical power. Sample size tables (Cohen, 1988) 
may be used to increase power, but they are based on Monte Carlo studies of relatively "tame" 
mathematical distributions, as compared to psychology data sets. A prevalent psychometric 
measure distribution, the "discrete mass at zero with gap", occurs with "first use" variables. The 
Type II error properties of the independent samples t test on real psychometric data of this type 
were in agreement with normal curve theory. Thus, in making sample size determinations, 
psychology researchers do not need to make any modifications for this common distribution. 



SAMPLE SIZE TABLES, t TEST, AND 
A PREVALENT PSYCHOMETRIC DISTRIBUTION 



In an analysis of 85 published psychotherapy outcome studies from 1984-1986, Kazdin 
and Bass (1989) found the power to detect differences between two or more treatments was 
weak. Similarly, Rossi (1990) calculated power for 6,155 statistical tests performed in psychologi- 
cal research published in 1982 and concluded the power was very low. A simple method to maxi- 
mize statistical power is through sample size determinations, such as through tables given by 
Cohen (1988). 
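
As a rough illustration of what a sample size determination involves, the sketch below uses the familiar normal approximation n = 2[(z(1 - alpha/2) + z(power)) / d]^2 per group for a two-tailed comparison of two independent means. It is not Cohen's (1988) tabling procedure; the function name and default values are assumptions made for this example, and the approximation tends to run slightly below exact t-based values.

```python
from math import ceil
from statistics import NormalDist


def approx_n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-tailed, two-sample test.

    d is the standardized mean difference (effect size). This is the
    normal approximation only, not a noncentral t calculation.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # two-tailed critical value
    z_power = z.inv_cdf(power)            # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(d, approx_n_per_group(d))
```

Smaller effect sizes drive the required n up quickly, which is why low power is common when samples are chosen without such a calculation.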

A serious question has arisen, however, that may directly affect sample size calculations. 
The question pertains to the robustness of parametric statistics such as the t and F tests to depar- 
tures from population normality. Monte Carlo studies by Boneau (1960, 1962) and Glass, Peck- 
ham and Sanders (1972), and the text by Scheffe (1959) are often cited as evidence of the 
robustness of these tests. Bradley (1968, 1977, 1978, 1982) countered that their simulation 
studies were limited to the investigation of mathematically convenient distributions that are rela- 
tively "tame" in comparison to distributions obtained in psychology. 

Micceri (1986, 1989) obtained 440 large sample data sets from research published in 
1982-1984 that dramatically underscore Bradley's concerns. One hundred and twenty-five of 
the 440 large sample data sets were obtained from psychometric measures, including Minnesota 
Multiphasic Personality Inventory scales, Mallory test of visual hallucinations, and a variety of 
measures of anger, anxiety, curiosity, locus of control, masculinity/femininity, satisfaction, 
sociability, etc. A mere 3% of these data sets were considered near Gaussian. Micceri (1989) 
concluded that studies by Boneau and others of the robustness of the t and F tests under smooth 
mathematical curves were inconclusive, because "none of these comparisons occurs in real life" 
(p. 164). 

Figure 1 is a histogram depicting a prevalent psychometric measure distribution 
described by Micceri as "discrete mass at zero with gap". This example represents the responses 
of 516 adolescents to a survey question asking their age when they first began to smoke 
cigarettes. Distributions such as these typically occur with "first use" or "onset" variables in 
psychological research. Researchers may only be interested in responses above zero. Consider the 
two paradigms emerging in recent substance use literature: "User/Abuser" and 
"Nonuser/User/Abuser". For studies such as Long and Scherl (1984) and Hillman and 
Sawilowsky (1991), the percent of nonusers would be reported, but hypotheses (such as compar- 
ing means) would be tested only on user and abuser categories. Although this part of the curve 
is a bit flatter than the normal curve, little doubt is raised regarding Type II error properties. 
For the latter case, however, such as Shedler and Block (1990) who studied "Abstainers, Ex- 
perimenters, and Frequent Users", it would be appropriate to compare means for all three 
groups even though the nonuser scores are discrete zeros. Figure 1 probably depicts a population 
shape unimagined by psychology researchers who think in terms of normal curve theory. 

PURPOSE OF THE STUDY 
The issue raised by Micceri applies equally to Type II error (or power). Cohen (1988) 
relied primarily on Scheffe (1959) and Boneau (1960, 1962) in preparing sample size tables. 
Because these studies used mathematically convenient and tame distributions, the issues raised by 
Bradley (1968) and Micceri (1989) bring into question the validity of the tables. The purpose of 
this study, then, is to use Monte Carlo methods to investigate the Type I and Type II error 
properties of the independent samples t test under a "discrete mass at zero with gap" data set to 
determine if sample size tables in Cohen (1988) yield correct results. 

METHODOLOGY 

Monte Carlo methods were used with a FORTRAN program to sample with replacement 
from a population of 516 responses to a survey question regarding the age of first use of 
cigarettes. Scores were also sampled from a Gaussian distribution to demonstrate the adequacy 
of the simulation. Random responses were drawn for sample sizes (n1,n2) = (5,15), (10,10), 
(10,30), (20,20), (15,45), (30,30), (20,60), (40,40), (30,90), and (60,60). 
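
The resampling step can be sketched in Python rather than FORTRAN. The survey data are not reproduced in this paper, so the 516-score population below is hypothetical: its counts were invented only to give a discrete mass at zero followed by a gap. The function names are also assumptions, and the critical value must be supplied by the caller (for example, 2.101 for a two-tailed .05 test with 18 degrees of freedom).

```python
import random
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL population: 200 nonusers scored 0, a gap, then ages of
# first use. These counts are invented for illustration (total = 516).
AGE_COUNTS = {9: 10, 10: 20, 11: 35, 12: 55, 13: 70,
              14: 60, 15: 40, 16: 20, 17: 6}
POPULATION = [0] * 200 + [age for age, cnt in AGE_COUNTS.items()
                          for _ in range(cnt)]


def pooled_t(x, y):
    """Independent samples t statistic using the pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    if sp2 == 0:
        # Every score identical: t is undefined, so count it as no rejection.
        return 0.0
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))


def tail_rejection_rates(population, n1, n2, t_crit, reps=10_000, seed=0):
    """Proportion of lower-tail and upper-tail rejections under a true null."""
    rng = random.Random(seed)
    lower = upper = 0
    for _ in range(reps):
        # random.choices samples with replacement.
        t = pooled_t(rng.choices(population, k=n1),
                     rng.choices(population, k=n2))
        lower += t < -t_crit
        upper += t > t_crit
    return lower / reps, upper / reps
```

Calling `tail_rejection_rates(POPULATION, 10, 10, 2.101)` returns the two tail rates separately, which is how the results are reported later in the paper. Because the population is invented, the rates it produces are not the study's.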

The Type I error rates were obtained by computing the t statistic on each sample pair 
with ten thousand (10,000) repetitions performed for the .10, .05, and .01 alpha levels. The 
robustness of the independent samples t test with respect to Type II error was investigated as 
follows: Let $X_{1i}$ and $X_{2j}$ be observations in two random samples with mean $\mu_1$ and 
standard deviation $\sigma_1$. Transformed variables called $X'_{1i}$ and $X'_{2j}$ were generated by 

(1) $X'_{1i} = X_{1i} - \mu_1, \quad i = 1, \ldots, n_1$

(2) $X'_{2j} = c\,(X_{2j} - \mu_1) + k\sigma_1, \quad j = 1, \ldots, n_2$

where c and k are constants. 
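
A minimal sketch of equations (1) and (2) follows. It reads both samples as being centered on the same population mean, with c rescaling the spread of the second sample and k shifting it in standard deviation units. The function name and argument names are assumptions for this example.

```python
def transform_samples(x1, x2, mu, sigma, c=1.0, k=0.0):
    """Apply equations (1) and (2) to a pair of samples.

    mu and sigma are the mean and standard deviation of the population
    that both samples were drawn from.
    """
    x1_t = [x - mu for x in x1]                    # equation (1)
    x2_t = [c * (x - mu) + k * sigma for x in x2]  # equation (2)
    return x1_t, x2_t
```

With c = 1 and k = 0 the two transformed samples have the same population mean and variance, which is the null case used for the Type I error rates.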

Hypotheses of shift in location parameters were investigated by setting c = 1 and choosing k 
so that the constant added to $X_{2j}$ was $.2\sigma$, $.5\sigma$, $.8\sigma$, and $1.2\sigma$, where $\sigma$ represents the 
standard deviation of the distribution. Hypotheses of shift in mean plus increases in variance 
were investigated by using the same values of k, but c was made equal to $\sqrt{2}$, $\sqrt{3}$, and 2. 
Unbalanced layouts were not investigated for $c \neq 1$, as Cohen (1988) noted that for both 
unequal variances and unequal sample sizes the tabled power values "may be greatly in error" (p. 44). 
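
The power runs can be sketched the same way: draw two balanced samples, transform the second by c and k, and count two-tailed rejections. This is an illustrative Python sketch, not the study's FORTRAN program. The names `mc_power` and `shift_k` are assumptions, and the default critical value of 2.024 assumes a two-tailed .05 test with 38 degrees of freedom (n = 20 per group).

```python
import random
from math import sqrt
from statistics import mean, pstdev, variance


def pooled_t(x, y):
    """Independent samples t statistic using the pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    if sp2 == 0:
        return 0.0  # t undefined when all scores match; count as no rejection
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))


def mc_power(population, n, shift_k, c=1.0, t_crit=2.024, reps=2000, seed=0):
    """Estimated two-tailed rejection rate for balanced samples of size n."""
    rng = random.Random(seed)
    mu, sigma = mean(population), pstdev(population)
    hits = 0
    for _ in range(reps):
        # Sample with replacement, then center and shift as in (1) and (2).
        x1 = [x - mu for x in rng.choices(population, k=n)]
        x2 = [c * (x - mu) + shift_k * sigma
              for x in rng.choices(population, k=n)]
        hits += abs(pooled_t(x1, x2)) > t_crit
    return hits / reps
```

Running `mc_power` over `shift_k` values of .2, .5, .8, and 1.2 traces one power curve per sample size. Passing `c=sqrt(2)` adds the variance increase described above, for balanced layouts only.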

RESULTS AND DISCUSSION 
For the smallest unbalanced sample size (5,15) the t test was generally not robust. With 
nominal alpha at .10, the lower tail rejected only .023 and the upper tail was .050; with alpha at 
.05, the lower tail rejected only .009, while the upper tail was slightly liberal at .029; and at the 
.01 alpha level the lower tail was conservative at .001 and the upper tail was liberal at .007. 
However, for the remaining sample sizes, results were in agreement with normal curve theory. 
As noted in Figure 2, the power curves for the nonnormal data set for sample size (5,15) 
indicated a slight power loss of .01. At the larger sample sizes the power was virtually identical to 
that expected under normal curve theory. Thus, when confronted with nonnormal data sets such 
as this, psychology researchers need not make any modifications to Cohen's (1988) tables when 
making sample size determinations. 